AITopics | La Plata County

La Plata County

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Zhang, Jintao, Wei, Jia, Huang, Haofeng, Zhang, Pengle, Zhu, Jun, Chen, Jianfei

arXiv.org Artificial Intelligence | Dec-24-2024

When handling large sequence lengths, attention becomes the primary time-consuming component. Although quantization has proven to be an effective method for accelerating model inference, existing quantization methods primarily focus on optimizing the linear layer. In response, we first analyze in detail the feasibility of quantization in attention. Following that, we propose SageAttention, a highly efficient and accurate quantization method for attention. The OPS (operations per second) of our approach outperforms FlashAttention2 and xformers by about 2.1x and 2.7x, respectively. SageAttention also achieves superior accuracy over FlashAttention3. Comprehensive experiments confirm that our approach incurs almost no end-to-end metrics loss across diverse models, including those for large language processing, image generation, and video generation. Attention is the fundamental component of transformers (Vaswani, 2017), and efficiently computing attention is crucial for transformer-based applications. Moreover, there is a recent trend in processing longer sequences, which further strengthens the need for faster attention.
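To make the idea of 8-bit attention concrete, the following is a minimal sketch of symmetric per-tensor INT8 quantization applied to the query-key product. It is a generic illustration and not SageAttention's actual kernel: the abstract does not describe the quantization granularity or any preprocessing, so the function names and the per-tensor scaling used here are assumptions.

```python
import math

def quantize_int8(matrix):
    """Symmetric per-tensor quantization (assumed scheme): returns (ints, scale)."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(x / scale))) for x in row] for row in matrix]
    return q, scale

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def float_attention_scores(Q, K):
    """Reference: softmax(Q K^T / sqrt(d)) in floating point."""
    d = len(Q[0])
    return [softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K])
            for q in Q]

def int8_attention_scores(Q, K):
    """Same scores, but Q K^T is computed on INT8 operands and then dequantized."""
    d = len(Q[0])
    q_int, q_scale = quantize_int8(Q)
    k_int, k_scale = quantize_int8(K)
    out = []
    for q_row in q_int:
        # Integer dot product, rescaled by the product of the two scales.
        logits = [sum(a * b for a, b in zip(q_row, k_row)) * q_scale * k_scale / math.sqrt(d)
                  for k_row in k_int]
        out.append(softmax(logits))
    return out

# Illustrative inputs (made up for this sketch).
Q = [[0.1, -0.4, 0.7, 0.2], [0.9, 0.3, -0.5, 0.0]]
K = [[0.5, 0.1, -0.2, 0.8], [-0.3, 0.6, 0.4, -0.1], [0.2, -0.7, 0.9, 0.3]]
approx = int8_attention_scores(Q, K)
exact = float_attention_scores(Q, K)
```

Comparing `approx` with `exact` shows how small the error introduced by the integer matrix product is on well-scaled inputs; real tensors with outliers are harder, which is the kind of accuracy problem a dedicated method has to address.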

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.02367

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Colorado > La Plata County > Durango (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

WavePulse: Real-time Content Analytics of Radio Livestreams

Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay

arXiv.org Artificial Intelligence | Dec-23-2024

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at https://wave-pulse.io.

large language model, machine learning, real time system, (23 more...)

arXiv.org Artificial Intelligence

2412.17998

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > Kings County > New York City (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(215 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Media > Radio (1.00)
Leisure & Entertainment (1.00)
Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(5 more...)

Hydra-LSTM: A semi-shared Machine Learning architecture for prediction across Watersheds

Ruparell, Karan, Marks, Robert J., Wood, Andy, Hunt, Kieran M. R., Cloke, Hannah L., Prudhomme, Christel, Pappenberger, Florian, Chantry, Matthew

arXiv.org Artificial Intelligence | Oct-21-2024

Long Short Term Memory networks (LSTMs) are used to build single models that predict river discharge across many catchments. These models offer greater accuracy than models trained on each catchment independently if using the same data. However, the same data is rarely available for all catchments. This prevents the use of variables available only in some catchments, such as historic river discharge or upstream discharge. The only existing method that allows for optional variables requires all variables to be considered in the initial training of the model, limiting its transferability to new catchments. To address this limitation, we develop the Hydra-LSTM. The Hydra-LSTM processes variables used across all catchments and variables used in only some catchments separately to allow general training and use of catchment-specific data in individual catchments. The bulk of the model can be shared across catchments, maintaining the benefits of multi-catchment models to generalise, while also benefitting from the advantages of using bespoke data. We apply this methodology to one-day-ahead river discharge prediction in the Western US, as next-day river discharge prediction is the first step towards prediction across longer time scales. We obtain state-of-the-art performance, generating more accurate median and quantile predictions than Multi-Catchment and Single-Catchment LSTMs while allowing local forecasters to easily introduce and remove variables from their prediction set. We test the ability of the Hydra-LSTM to incorporate catchment-specific data by introducing historical river discharge as a catchment-specific input, outperforming state-of-the-art models without needing to train an entirely new model.

artificial intelligence, catchment, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.16343

Country:

Europe > United Kingdom > England > Berkshire > Reading (0.04)
North America > United States > Idaho > Ada County > Boise (0.04)
North America > United States > Montana (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets

Kwon, Oh Joon, Matsunaga, Daiki E., Kim, Kee-Eung

arXiv.org Artificial Intelligence | Oct-19-2024

A critical component of the current generation of language models is preference alignment, which aims to precisely control the model's behavior to meet human needs and values. The most notable among such methods are Reinforcement Learning with Human Feedback (RLHF) and its offline variant Direct Preference Optimization (DPO), both of which seek to maximize a reward model based on human preferences. In particular, DPO derives reward signals directly from the offline preference data, but in doing so overfits the reward signals and generates suboptimal responses that may contain human biases in the dataset. In this work, we propose a practical application of a diversity-seeking RL algorithm called GFlowNet-DPO (GDPO) in an offline preference alignment setting to curtail such challenges. Empirical results show that GDPO can generate far more diverse responses than the baseline methods while remaining relatively aligned with human values in dialog generation and summarization tasks.
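For readers unfamiliar with how DPO derives a reward signal from preference pairs, here is a small sketch of the standard DPO loss for a single (chosen, rejected) pair. It illustrates the baseline the abstract refers to, not GDPO itself, whose objective is not given here. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under the
    policy (logp_*) or the frozen reference model (ref_logp_*).
    """
    # Implicit reward of a response: beta * (policy log-prob - reference log-prob).
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# A policy identical to the reference gives a zero margin.
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# A policy that raises the chosen response and lowers the rejected one.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

Because the loss depends only on the log-probability margin between the two responses, pushing that margin ever higher on a fixed dataset is one way such an objective can overfit the preference data.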

machine learning, natural language, password, (16 more...)

arXiv.org Artificial Intelligence

2410.15096

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > Colorado > La Plata County (0.04)
Europe > Germany (0.04)
(4 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale

Bao, Fan, Nie, Shen, Xue, Kaiwen, Li, Chongxuan, Pu, Shi, Wang, Yaole, Yue, Gang, Cao, Yue, Su, Hang, Zhu, Jun

arXiv.org Artificial Intelligence | May-30-2023

This paper proposes a unified diffusion framework (dubbed UniDiffuser) to fit all distributions relevant to a set of multi-modal data in one model. Our key insight is that learning diffusion models for marginal, conditional, and joint distributions can be unified as predicting the noise in the perturbed data, where the perturbation levels (i.e., timesteps) can be different for different modalities. Inspired by the unified view, UniDiffuser learns all distributions simultaneously with a minimal modification to the original diffusion model: it perturbs data in all modalities instead of a single modality, inputs individual timesteps in different modalities, and predicts the noise of all modalities instead of a single modality. UniDiffuser is parameterized by a transformer for diffusion models to handle input types of different modalities. Implemented on large-scale paired image-text data, UniDiffuser is able to perform image, text, text-to-image, image-to-text, and image-text pair generation by setting proper timesteps without additional overhead. In particular, UniDiffuser is able to produce perceptually realistic samples in all tasks, and its quantitative results (e.g., the FID and CLIP score) are not only superior to existing general-purpose models but also comparable to bespoke models (e.g., Stable Diffusion and DALL-E 2) in representative tasks (e.g., text-to-image generation).
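The phrase "setting proper timesteps" can be illustrated with a short sketch that maps each generation task to a pair of per-modality timesteps. The convention used below is an assumption made for illustration: timestep 0 means a clean, fully observed modality used as the condition, and the maximum timestep means a fully noised modality that is effectively ignored. The value of `T`, the task names, and the function name are also hypothetical.

```python
T = 1000  # assumed maximum timestep, i.e. the modality is pure noise

def timesteps_for_task(task, t):
    """Return (t_image, t_text) for the current denoising step t.

    Assumed convention: 0 = clean condition, T = fully noised (ignored).
    """
    if task == "joint":          # image-text pair: denoise both together
        return (t, t)
    if task == "text_to_image":  # text is the clean condition
        return (t, 0)
    if task == "image_to_text":  # image is the clean condition
        return (0, t)
    if task == "image":          # marginal over images: text carries no information
        return (t, T)
    if task == "text":           # marginal over text: image carries no information
        return (T, t)
    raise ValueError(f"unknown task: {task}")

# Timestep pairs for every task at an intermediate step.
schedule = {name: timesteps_for_task(name, 500)
            for name in ("joint", "text_to_image", "image_to_text", "image", "text")}
```

Under this reading, one noise-prediction network covers all five tasks because the task is encoded entirely in which timesteps are fed to it, so no task-specific weights are needed.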

artificial intelligence, machine learning, unidiffuser, (19 more...)

arXiv.org Artificial Intelligence

2303.06555

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Slovenia (0.04)
Atlantic Ocean (0.04)
(10 more...)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.86)

Android Things and Machine Learning

#artificialintelligence | Apr-21-2017, 17:59:51 GMT

Android Things allows you to make amazing IoT devices with simple code, but one of the things that can make a device extraordinary is machine learning. While there are a few services available online that will allow you to upload data and will return results, being able to use machine learning locally and offline can be incredibly useful. Machine learning can help solve problems that conventional apps cannot. To provide context, let's go through a simple example where machine learning can be used with an IoT device to improve daily life. Here in Colorado, it's not uncommon to see news articles about wildlife coming out from the mountains and walking around a downtown: I've even had a friend post video of a bear outside of their home!

artificial intelligence, machine learning, tensorflow, (13 more...)

#artificialintelligence

Country: North America > United States > Colorado > La Plata County > Durango (0.05)

Industry:

Information Technology (0.31)
Government (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Learning Sheds New Light on an Ancient Mystery According to Robert G. Cathcart

#artificialintelligence | Sep-13-2016, 10:15:37 GMT

Still think the Great Pyramid is an ancient tomb? Some experts in deep learning may beg to differ. They have discovered that applying a particular deep learning methodology to the ancient structures on the Giza Plateau, including the Great Pyramid, may in fact show that the plateau is a three-dimensional process diagram. "We were using a deep learning methodology that helps us identify and categorize unknown symbols and processes. We thought it would be fun to use it on something everyone knows. We were amazed to find that the Giza Plateau may in fact be a three dimensional process diagram."

artificial intelligence, cathcart, machine learning, (10 more...)

#artificialintelligence

Country:

Africa > Middle East > Egypt > Giza Governorate > Giza (0.52)
North America > United States > Colorado > La Plata County > Durango (0.08)

Genre: Press Release (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing

AAAI Conferences | Aug-8-2011

Crowdsourcing is an effective tool for scalable data annotation in both research and enterprise contexts. Due to crowdsourcing’s open participation model, quality assurance is critical to the success of any project. Present methods rely on EM-style post-processing or manual annotation of large gold standard sets. In this paper we present an automated quality assurance process that is inexpensive and scalable. Our novel process relies on programmatic gold creation to provide targeted training feedback to workers and to prevent common scamming scenarios. We find that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.

artificial intelligence, gold unit, social media, (15 more...)

AAAI Conferences

Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.15)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > North Carolina > Pitt County > Greenville (0.04)
(3 more...)

Industry: Materials > Metals & Mining > Gold (0.79)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence (1.00)
